Do Prompts Reshape Representations? An Empirical Study of Prompting Effects on Embeddings

Gonzalez-Gutierrez, Cesar, Hovy, Dirk

arXiv.org Artificial Intelligence

Prompting is a common approach for leveraging LMs in zero-shot settings. However, the underlying mechanisms that enable LMs to perform diverse tasks without task-specific supervision remain poorly understood. Studying the relationship between prompting and the quality of internal representations can shed light on how pre-trained embeddings may support in-context task solving. In this empirical study, we conduct a series of probing experiments on prompt embeddings, analyzing various combinations of prompt templates for zero-shot classification. Our findings show that while prompting affects the quality of representations, these changes do not consistently correlate with the relevance of the prompts to the target task. This result challenges the assumption that more relevant prompts necessarily lead to better representations. We further analyze potential factors that may contribute to this unexpected behavior.
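Probing experiments of this kind are typically linear classifiers trained on frozen embeddings, with probe accuracy serving as the representation-quality measure. Below is a minimal, self-contained sketch of that setup; the toy 2-D "embeddings" and all names are hypothetical stand-ins, not the paper's actual models or data.

```python
import math
import random

def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression probe; only the probe's weights are trained,
    while the embeddings stay frozen (as in representation probing)."""
    dim = len(embeddings[0])
    w, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            z = bias + sum(wi * xi for wi, xi in zip(w, x))
            z = max(-30.0, min(30.0, z))        # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y                           # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            bias -= lr * g
    return w, bias

def probe_accuracy(w, bias, embeddings, labels):
    hits = sum(((bias + sum(wi * xi for wi, xi in zip(w, x))) > 0) == (y == 1)
               for x, y in zip(embeddings, labels))
    return hits / len(labels)

# Toy 2-D "embeddings": two Gaussian clusters standing in for two classes.
random.seed(0)
data = [([random.gauss(m, 0.3), random.gauss(m, 0.3)], y)
        for y, m in ((0, -1.0), (1, 1.0)) for _ in range(50)]
X, Y = [d[0] for d in data], [d[1] for d in data]
w, bias = train_linear_probe(X, Y)
print(probe_accuracy(w, bias, X, Y))
```

Comparing this accuracy across embeddings produced under different prompt templates is the kind of measurement the study relies on.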


TrackFormers Part 2: Enhanced Transformer-Based Models for High-Energy Physics Track Reconstruction

Caron, Sascha, Dobreva, Nadezhda, Kimpel, Maarten, Odyurt, Uraz, Pshenov, Slav, Bazan, Roberto Ruiz de Austri, Shalugin, Eugene, Wolffs, Zef, Zhao, Yue

arXiv.org Artificial Intelligence

High-Energy Physics experiments are rapidly escalating in generated data volume, a trend that will intensify with the upcoming High-Luminosity LHC upgrade. This surge in data necessitates critical revisions across the data processing pipeline, with particle track reconstruction being a prime candidate for improvement. In our previous work, we introduced "TrackFormers", a collection of Transformer-based one-shot encoder-only models that effectively associate hits with expected tracks. In this study, we extend our earlier efforts by incorporating loss functions that account for inter-hit correlations, conducting detailed investigations into various Transformer attention mechanisms, and studying the reconstruction of higher-level objects. Furthermore, we discuss new datasets that allow training at the hit level for a range of physics processes. These developments collectively aim to boost the accuracy, and potentially the efficiency, of our tracking models, offering a robust solution to meet the demands of next-generation high-energy physics experiments.


Domain Pre-training Impact on Representations

Gonzalez-Gutierrez, Cesar, Quattoni, Ariadna

arXiv.org Artificial Intelligence

This empirical study analyzes the effects of the pre-training corpus on the quality of learned transformer representations. We focus on the representation quality induced solely through pre-training. Our experiments show that pre-training on a small, specialized corpus can yield effective representations, and that the success of combining a generic and a specialized corpus depends on the distributional similarity between the target task and the specialized corpus.


SKDU at De-Factify 4.0: Natural Language Features for AI-Generated Text-Detection

Malviya, Shrikant, Arnau-González, Pablo, Arevalillo-Herráez, Miguel, Katsigiannis, Stamos

arXiv.org Artificial Intelligence

The rapid advancement of large language models (LLMs) has introduced new challenges in distinguishing human-written text from AI-generated content. In this work, we explored a pipelined approach for AI-generated text detection that includes a feature extraction step (i.e., prompt-based rewriting features inspired by RAIDAR and content-based features derived from the NELA toolkit) followed by a classification module. Comprehensive experiments were conducted on the Defactify4.0 dataset, evaluating two tasks: binary classification to differentiate human-written and AI-generated text, and multi-class classification to identify which generative model produced the input text. Our findings reveal that NELA features significantly outperform RAIDAR features in both tasks, demonstrating their ability to capture nuanced linguistic, stylistic, and content-based differences. Combining RAIDAR and NELA features provided minimal improvement, highlighting the redundancy introduced by less discriminative features. Among the classifiers tested, XGBoost emerged as the most effective, leveraging the rich feature sets to achieve high accuracy and generalisation.
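The intuition behind RAIDAR-style rewriting features is that an LLM asked to rewrite AI-generated text tends to change it less than it changes human text. A minimal sketch of such a feature, using a normalised edit distance as the single measurement; the example strings and the `rewrite_feature` name are hypothetical illustrations, not the paper's feature set:

```python
def levenshtein(a, b):
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def rewrite_feature(original, rewritten):
    """Normalised edit distance: how much a rewriter changed the text."""
    return levenshtein(original, rewritten) / max(len(original), len(rewritten), 1)

# A rewriter edits the typo-laden human text heavily, the clean AI text not at all.
human = "the cat sat on teh mat and purred loudly"
human_rw = "The cat sat on the mat, purring loudly."
ai = "The cat sat on the mat and purred loudly."
ai_rw = "The cat sat on the mat and purred loudly."
print(rewrite_feature(human, human_rw) > rewrite_feature(ai, ai_rw))  # True
```

Such scalar features would then be concatenated with content-based features and fed to a classifier such as XGBoost.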


NRSurNN3dq4: A Deep Learning Powered Numerical Relativity Surrogate for Binary Black Hole Waveforms

Freitas, Osvaldo Gramaxo, Theodoropoulos, Anastasios, Villanueva, Nino, Fernandes, Tiago, Nunes, Solange, Font, José A., Onofre, Antonio, Torres-Forné, Alejandro, Martin-Guerrero, José D.

arXiv.org Artificial Intelligence

Gravitational wave approximants are widely used tools in gravitational-wave astronomy. They allow for dense coverage of the parameter space of binary black hole (BBH) mergers for purposes of parameter inference or, more generally, matched-filtering tasks, while avoiding the computationally expensive full evolution of numerical relativity simulations. However, this comes at a slight cost in accuracy when compared to numerical relativity waveforms, depending on the approach. One way to minimize this cost is to construct so-called~\textit{surrogate models} which, instead of using approximate physics or phenomenological formulae, interpolate within the space of numerical relativity waveforms. In this work, we introduce~\texttt{NRSurNN3dq4}, a surrogate model for non-precessing BBH merger waveforms powered by neural networks. By relying on the power of deep learning, this approximant is remarkably fast and competitively accurate: it can generate millions of waveforms in a tenth of a second, while mismatches with numerical relativity waveforms remain below $10^{-3}$. We implement this approximant within the~\textsc{bilby} framework for gravitational-wave parameter inference and show that it is suitable for parameter estimation tasks.
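The quoted mismatch has the standard definition: one minus the normalised overlap between two waveforms. A minimal sketch assuming a flat (white) noise spectrum and no time/phase maximisation, with toy sinusoids standing in for the actual NR and surrogate waveforms:

```python
import math

def inner(a, b):
    """Flat-PSD inner product: a plain dot product over waveform samples."""
    return sum(x * y for x, y in zip(a, b))

def mismatch(h1, h2):
    """1 - normalised overlap; 0 for identical waveforms (up to scale)."""
    return 1.0 - inner(h1, h2) / math.sqrt(inner(h1, h1) * inner(h2, h2))

# Toy sinusoid vs. a slightly perturbed copy, standing in for NR vs. surrogate.
t = [i / 1000 for i in range(1000)]
h_nr = [math.sin(2 * math.pi * 30 * ti) for ti in t]
h_sur = [math.sin(2 * math.pi * 30 * ti)
         + 1e-3 * math.cos(2 * math.pi * 90 * ti) for ti in t]
print(mismatch(h_nr, h_sur) < 1e-3)  # True: well inside the quoted bound
```

A realistic computation would weight the inner product by the detector noise power spectral density and maximise over time and phase shifts; this sketch omits both.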


A comparison of Bayesian sampling algorithms for high-dimensional particle physics and cosmology applications

Albert, Joshua, Balazs, Csaba, Fowlie, Andrew, Handley, Will, Hunt-Smith, Nicholas, de Austri, Roberto Ruiz, White, Martin

arXiv.org Machine Learning

For several decades now, Bayesian inference techniques have been applied to theories of particle physics, cosmology and astrophysics to obtain the probability density functions of their free parameters. In this study, we review and compare a wide range of Markov Chain Monte Carlo (MCMC) and nested sampling techniques to determine their relative efficacy on functions that resemble those encountered most frequently in the particle astrophysics literature. Our first battery of tests explores high-dimensional analytic test functions that exemplify particular challenges, for example highly multimodal posteriors or posteriors with curving degeneracies. We then investigate two real physics examples, the first being a global fit of the $\Lambda$CDM model using cosmic microwave background data from the Planck experiment, and the second being a global fit of the Minimal Supersymmetric Standard Model using a wide variety of collider and astrophysics data. We show that several examples widely thought to be most easily solved using nested sampling approaches can in fact be more efficiently solved using modern MCMC algorithms, but the details of the implementation matter. We also provide a series of useful insights for practitioners of particle astrophysics and cosmology.
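The multimodality challenge mentioned above can be made concrete with a bare-bones Metropolis-Hastings sampler on a two-mode target. This is purely illustrative, not one of the production algorithms compared in the paper, and the target and step size are arbitrary choices:

```python
import math
import random

def log_post(x):
    """Bimodal target: an equal mixture of unit-width Gaussians at +-3."""
    return math.log(math.exp(-0.5 * (x - 3) ** 2) + math.exp(-0.5 * (x + 3) ** 2))

def metropolis(n, step, seed=1):
    """Random-walk Metropolis-Hastings with a Gaussian proposal."""
    random.seed(seed)
    x, lp = 0.0, log_post(0.0)
    chain = []
    for _ in range(n):
        xp = x + random.gauss(0, step)
        lpp = log_post(xp)
        if lpp - lp > math.log(random.random()):  # accept/reject step
            x, lp = xp, lpp
        chain.append(x)
    return chain

# With a step comparable to the mode separation, the chain hops between modes;
# a much smaller step would leave it stuck in one mode for long stretches.
chain = metropolis(20000, 2.5)
print(any(x > 1 for x in chain) and any(x < -1 for x in chain))
```

Whether such mode-hopping happens at all is exactly the kind of implementation detail (proposal scale, parallel tempering, ensemble moves) that the comparison finds decisive.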


Towards AI-Safety-by-Design: A Taxonomy of Runtime Guardrails in Foundation Model based Systems

Shamsujjoha, Md, Lu, Qinghua, Zhao, Dehai, Zhu, Liming

arXiv.org Artificial Intelligence

The rapid advancement and widespread deployment of foundation model (FM) based systems have revolutionized numerous applications across various domains. However, the fast-growing capabilities and autonomy have also raised significant concerns about responsible AI and AI safety. Recently, there has been increasing attention toward implementing guardrails to ensure the runtime behavior of FM-based systems is safe and responsible. Given the early stage of FMs and their applications (such as agents), the design of guardrails has not yet been systematically studied. It remains underexplored which software qualities should be considered when designing guardrails and how these qualities can be ensured from a software architecture perspective. Therefore, in this paper, we present a taxonomy for guardrails to classify and compare the characteristics and design options of guardrails. Our taxonomy is organized into three main categories: the motivation behind adopting runtime guardrails, the quality attributes to consider, and the design options available. This taxonomy provides structured and concrete guidance for making architectural design decisions when designing guardrails and highlights trade-offs arising from the design decisions.
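One common design option a runtime guardrail can take is a filter wrapped around the model call, screening both input and output. The sketch below is a hypothetical illustration of that pattern (the deny-list rule and all names are invented, not taken from the paper's taxonomy):

```python
def blocked(text, deny=("password", "ssn")):
    """Toy content rule: flag text containing deny-listed terms."""
    t = text.lower()
    return any(term in t for term in deny)

def guarded(model, prompt, fallback="[withheld by guardrail]"):
    """Runtime guardrail wrapping an FM call: screen the input, call the
    model, then screen the output before it reaches the user."""
    if blocked(prompt):
        return fallback                            # input-side guardrail
    out = model(prompt)
    return fallback if blocked(out) else out       # output-side guardrail

fake_model = lambda p: "Here is the admin password: hunter2"
print(guarded(fake_model, "summarise the report"))   # output is blocked
print(guarded(fake_model, "what is the password?"))  # input is blocked
```

Where such a filter sits (input, output, or both), what rules it enforces, and what it does on a violation are precisely the kinds of design options a taxonomy of this sort classifies.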


Invariance of deep image quality metrics to affine transformations

Alabau-Bosque, Nuria, Daudén-Oliver, Paula, Vila-Tomás, Jorge, Laparra, Valero, Malo, Jesús

arXiv.org Artificial Intelligence

Deep architectures are the current state-of-the-art in predicting subjective image quality. Usually, these models are evaluated according to their ability to correlate with human opinion in databases with a range of distortions that may appear in digital media. However, these overlook affine transformations, which may better represent the changes actually occurring in images under natural conditions. Humans can be particularly invariant to these natural transformations, as opposed to digital ones. In this work, we evaluate state-of-the-art deep image quality metrics by assessing their invariance to affine transformations, specifically: rotation, translation, scaling, and changes in spectral illumination. Here, invariance of a metric refers to the fact that certain distances should be neglected (considered to be zero) if their values are below a threshold. This is what we call the invisibility threshold of a metric. We propose a methodology to assign such invisibility thresholds for any perceptual metric. This methodology involves transformations to a distance space common to any metric, and psychophysical measurements of thresholds in this common space. By doing so, we allow the analyzed metrics to be directly comparable with actual human thresholds. We find that none of the state-of-the-art metrics shows human-like results under this strong test based on invisibility thresholds. This means that tuning the models exclusively to predict the visibility of generic distortions may disregard other properties of human vision, such as invariances or invisibility thresholds.
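The invisibility-threshold idea can be sketched as a clamp on a metric's distance: any value below the psychophysically measured threshold is treated as zero. The RMSE metric, the toy "images", and the threshold value below are placeholders, not the paper's metrics or measured thresholds:

```python
def rmse(a, b):
    """A toy image-quality metric: root-mean-square error between images."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def perceived_distance(metric_value, invisibility_threshold):
    """Clamp metric distances below the human threshold to zero ("invisible")."""
    return 0.0 if metric_value < invisibility_threshold else metric_value

img = [0.2, 0.5, 0.8, 0.5]
translated = [0.21, 0.51, 0.79, 0.5]  # a tiny shift humans would not notice
distorted = [0.6, 0.1, 0.4, 0.9]      # a clearly visible distortion
thr = 0.05
print(perceived_distance(rmse(img, translated), thr))        # 0.0
print(perceived_distance(rmse(img, distorted), thr) > thr)   # True
```

A human-like metric, under this test, should report zero for the translated image while still flagging the visible distortion.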


Image Segmentation via Divisive Normalization: dealing with environmental diversity

Hernández-Cámara, Pablo, Vila-Tomás, Jorge, Dauden-Oliver, Paula, Alabau-Bosque, Nuria, Laparra, Valero, Malo, Jesús

arXiv.org Artificial Intelligence

Autonomous driving is a challenging scenario for image segmentation due to the presence of uncontrolled environmental conditions and the potentially catastrophic consequences of failures. Previous work suggested that a biologically motivated computation, the so-called Divisive Normalization, could be useful to deal with image variability, but its effects have not been systematically studied over different data sources and environmental factors. Here we put segmentation U-nets augmented with Divisive Normalization to work far from their training conditions to find where this adaptation is most critical. We categorize the scenes according to their radiance level and dynamic range (day/night), and according to their achromatic/chromatic contrasts. We also consider video game (synthetic) images to broaden the range of environments. We check the performance in the extreme percentiles of such categorization. Then, we push the limits further by artificially modifying the images in perceptually/environmentally relevant dimensions: luminance, contrasts and spectral radiance. Results show that neural networks with Divisive Normalization achieve better results in all the scenarios, and their performance remains more stable with respect to the considered environmental factors and the nature of the source. Finally, we explain the improvements in segmentation performance in two ways: (1) by quantifying the invariance of the responses that incorporate Divisive Normalization, and (2) by illustrating the adaptive nonlinearity of the different layers, which depends on the local activity.
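In its generic form, Divisive Normalization divides each response by a pooled measure of neighbouring activity, which is what stabilises responses under global changes such as illumination. A minimal sketch (the constants `b` and `gamma` and the simple mean-pooling are illustrative choices, not the paper's fitted parameters or kernel):

```python
def divisive_normalization(x, b=0.1, gamma=0.5):
    """y_i = x_i / (b + gamma * mean_j |x_j|): each response is divided
    by a saturation constant plus pooled local activity."""
    pool = sum(abs(v) for v in x) / len(x)
    return [v / (b + gamma * pool) for v in x]

# The same edge pattern under low and high illumination...
low = [0.1, 0.2, 0.1]
high = [1.0, 2.0, 1.0]  # a 10x brighter version of the same scene
yl = divisive_normalization(low)
yh = divisive_normalization(high)
# ...keeps its contrast structure after normalization: the centre/surround
# response ratio is 2.0 in both cases, despite the 10x luminance change.
print(abs(yh[1] / yh[0] - yl[1] / yl[0]) < 1e-9)  # True
```

It is this kind of invariance of the normalized responses that the paper quantifies to explain the segmentation gains.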


Novel Approaches for ML-Assisted Particle Track Reconstruction and Hit Clustering

Odyurt, Uraz, Dobreva, Nadezhda, Wolffs, Zef, Zhao, Yue, Sánchez, Antonio Ferrer, Bazan, Roberto Ruiz de Austri, Martín-Guerrero, José D., Varbanescu, Ana-Lucia, Caron, Sascha

arXiv.org Artificial Intelligence

Track reconstruction is a vital aspect of High-Energy Physics (HEP) and plays a critical role in major experiments. In this study, we delve into unexplored avenues for particle track reconstruction and hit clustering. Firstly, we enhance the algorithmic design effort by utilising a simplified simulator (REDVID) to generate training data that is specifically composed for simplicity. We demonstrate the effectiveness of this data in guiding the development of optimal network architectures. Additionally, we investigate the application of image segmentation networks for this task, exploring their potential for accurate track reconstruction. Moreover, we approach the task from a different perspective by treating it as a hit sequence to track sequence translation problem. Specifically, we explore the utilisation of Transformer architectures for tracking purposes. Our preliminary findings are covered in detail. By considering this novel approach, we aim to uncover new insights and potential advancements in track reconstruction. This research sheds light on previously unexplored methods and provides valuable insights for the field of particle track reconstruction and hit clustering in HEP.